Overview
Dataset statistics
| Number of variables | 12 |
|---|---|
| Number of observations | 5410 |
| Missing cells | 370 |
| Missing cells (%) | 0.6% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 2.0 MiB |
| Average record size in memory | 387.3 B |
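The percentages in this table can be recomputed from the counts. The sketch below assumes the missing-cell percentage is the number of missing cells divided by variables times observations, and takes the per-column missing counts (18, 35, 317) from the Top1_ActualValue, Top2_ActualValue and Top3_ActualValue sections of this report.

```python
# Figures copied from this report.
n_variables = 12
n_observations = 5410
missing_per_column = {
    "Top1_ActualValue": 18,
    "Top2_ActualValue": 35,
    "Top3_ActualValue": 317,
}

# Assumption: missing % = missing cells / (variables * observations).
missing_cells = sum(missing_per_column.values())
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size (bytes) times row count, expressed in MiB.
avg_record_bytes = 387.3
total_mib = avg_record_bytes * n_observations / 2**20

print(missing_cells)          # 370
print(round(missing_pct, 1))  # 0.6
print(round(total_mib, 1))    # 2.0
```

All missing cells come from the three ActualValue columns; every other column is complete.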
Variable types
| Text | 1 |
|---|---|
| Numeric | 8 |
| Categorical | 3 |
Alerts
| Alert | Type |
|---|---|
| Fraud_Probability is highly overall correlated with Fraud_Probability_Percent and Top1_SHAP | High correlation |
| Fraud_Probability_Percent is highly overall correlated with Fraud_Probability and Top1_SHAP | High correlation |
| Top1_SHAP is highly overall correlated with Fraud_Probability and Fraud_Probability_Percent | High correlation |
| Top3_ActualValue is highly overall correlated with Top3_Feature | High correlation |
| Top3_Feature is highly overall correlated with Top3_ActualValue | High correlation |
| Top1_Feature is highly imbalanced (74.2%) | Imbalance |
| Top3_ActualValue has 317 (5.9%) missing values | Missing |
| Provider has unique values | Unique |
| Top2_ActualValue has 515 (9.5%) zeros | Zeros |
| Top3_ActualValue has 693 (12.8%) zeros | Zeros |
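The perfect correlation between Fraud_Probability and Fraud_Probability_Percent suggests that one column is simply the other multiplied by 100. A quick check on the first five sample rows of this report supports that reading; the tolerance is an assumption chosen to absorb the six-decimal rounding of the printed values.

```python
# (Fraud_Probability, Fraud_Probability_Percent) pairs from the first sample rows.
sample = [
    (0.018644, 1.864357),
    (0.016717, 1.671673),
    (0.982605, 98.260470),
    (0.003985, 0.398518),
    (0.983695, 98.369470),
]

# Probabilities are printed to 6 decimals, so allow for rounding error.
TOLERANCE = 1e-4

is_rescaled = all(abs(100 * p - pct) < TOLERANCE for p, pct in sample)
print(is_rescaled)  # True
```

If the relationship holds for every row, either column could be dropped before further analysis without losing information.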
Reproduction
| Analysis started | 2025-06-10 13:52:43.011235 |
|---|---|
| Analysis finished | 2025-06-10 13:52:56.552816 |
| Duration | 13.54 seconds |
| Software version | ydata-profiling v4.16.1 |
| Download configuration | config.json |
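A report of this kind can be regenerated with ydata-profiling's `ProfileReport` class. The following is a minimal sketch only: the CSV path, the output file name and the report title are hypothetical placeholders, since the report does not record the input file or the settings used.

```python
def build_profile(csv_path, out_html="report.html"):
    """Profile a CSV file and write an HTML report (sketch, default settings)."""
    # Imported lazily so the function can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    # The title is a placeholder, not taken from the original run.
    profile = ProfileReport(df, title="Provider fraud explanations")
    profile.to_file(out_html)
    return profile


# Example call (hypothetical file name):
# build_profile("provider_explanations.csv")
```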
Variables
Provider
Text
Unique 
| Distinct | 5410 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 343.5 KiB |
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 8 |
| Min length | 8 |
Unique
| Unique | 5410 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | PRV52145 |
|---|---|
| 2nd row | PRV55104 |
| 3rd row | PRV54894 |
| 4th row | PRV54927 |
| 5th row | PRV55215 |
| Value | Count | Frequency (%) |
| prv56021 | 1 | < 0.1% |
| prv55405 | 1 | < 0.1% |
| prv51787 | 1 | < 0.1% |
| prv51875 | 1 | < 0.1% |
| prv55810 | 1 | < 0.1% |
| prv56873 | 1 | < 0.1% |
| prv56381 | 1 | < 0.1% |
| prv51213 | 1 | < 0.1% |
| prv56582 | 1 | < 0.1% |
| prv52781 | 1 | < 0.1% |
| Other values (5400) | 5400 | 99.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| 5 | 7865 | 18.2% |
| P | 5410 | 12.5% |
| R | 5410 | 12.5% |
| V | 5410 | 12.5% |
| 1 | 2495 | 5.8% |
| 6 | 2452 | 5.7% |
| 3 | 2448 | 5.7% |
| 2 | 2438 | 5.6% |
| 4 | 2433 | 5.6% |
| 7 | 2248 | 5.2% |
| Other values (3) | 4671 | 10.8% |
Fraud_Probability
Real number (ℝ)
High correlation 
| Distinct | 5401 |
|---|---|
| Distinct (%) | 99.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.15013172 |
| Minimum | 0.00022823537 |
|---|---|
| Maximum | 0.99438226 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 42.4 KiB |
Quantile statistics
| Minimum | 0.00022823537 |
|---|---|
| 5-th percentile | 0.00098103074 |
| Q1 | 0.0031666447 |
| median | 0.010029438 |
| Q3 | 0.070688568 |
| 95-th percentile | 0.93668438 |
| Maximum | 0.99438226 |
| Range | 0.99415402 |
| Interquartile range (IQR) | 0.067521923 |
Descriptive statistics
| Standard deviation | 0.29431393 |
|---|---|
| Coefficient of variation (CV) | 1.9603714 |
| Kurtosis | 2.3045603 |
| Mean | 0.15013172 |
| Median Absolute Deviation (MAD) | 0.0086339168 |
| Skewness | 1.9765938 |
| Sum | 812.2126 |
| Variance | 0.086620688 |
| Monotonicity | Not monotonic |
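Several of the figures above are derived from others in the same tables: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. A short check with the values copied from this section:

```python
import math

# Values copied from the Fraud_Probability tables above.
minimum, maximum = 0.00022823537, 0.99438226
q1, q3 = 0.0031666447, 0.070688568
mean, std = 0.15013172, 0.29431393

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance

print(round(value_range, 6))  # 0.994154
print(round(iqr, 6))          # 0.067522
print(round(cv, 4))           # 1.9604
print(round(variance, 6))     # 0.086621
```

The median (about 0.010) sits far below the mean (about 0.150), which is consistent with the strong positive skewness reported above.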
Common Values
| Value | Count | Frequency (%) |
| 0.001222939 | 3 | 0.1% |
| 0.0009220703 | 3 | 0.1% |
| 0.0015959388 | 2 | < 0.1% |
| 0.000902706 | 2 | < 0.1% |
| 0.0004993615 | 2 | < 0.1% |
| 0.002283027 | 2 | < 0.1% |
| 0.001508889 | 2 | < 0.1% |
| 0.043603275 | 1 | < 0.1% |
| 0.0013410501 | 1 | < 0.1% |
| 0.0012702412 | 1 | < 0.1% |
| Other values (5391) | 5391 | 99.6% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0.00022823537 | 1 | < 0.1% |
| 0.0002561417 | 1 | < 0.1% |
| 0.00026063184 | 1 | < 0.1% |
| 0.000264258 | 1 | < 0.1% |
| 0.00027018946 | 1 | < 0.1% |
| 0.00027636546 | 1 | < 0.1% |
| 0.00028678545 | 1 | < 0.1% |
| 0.00029498356 | 1 | < 0.1% |
| 0.00031468808 | 1 | < 0.1% |
| 0.00032562297 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 0.99438226 | 1 | < 0.1% |
| 0.99432385 | 1 | < 0.1% |
| 0.9941806 | 1 | < 0.1% |
| 0.9940916 | 1 | < 0.1% |
| 0.9940706 | 1 | < 0.1% |
| 0.9939267 | 1 | < 0.1% |
| 0.9933137 | 1 | < 0.1% |
| 0.99322236 | 1 | < 0.1% |
| 0.99285537 | 1 | < 0.1% |
| 0.9923529 | 1 | < 0.1% |
Top1_Feature
Categorical
Imbalance 
| Distinct | 12 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 505.1 KiB |
| Provider_Insurance_Claim_Reimbursement_Amt | 4497 |
|---|---|
| HospitalDuration_max | 479 |
| prv_avg_claims_indicator | 339 |
| DeductibleAmtPaid_sum | 45 |
| ChronicCond_Heartfailure_mean | 16 |
| Other values (7) | 34 |
Length
| Max length | 42 |
|---|---|
| Median length | 42 |
| Mean length | 38.573567 |
| Min length | 17 |
Unique
| Unique | 3 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | Provider_Insurance_Claim_Reimbursement_Amt |
|---|---|
| 2nd row | Provider_Insurance_Claim_Reimbursement_Amt |
| 3rd row | Provider_Insurance_Claim_Reimbursement_Amt |
| 4th row | Provider_Insurance_Claim_Reimbursement_Amt |
| 5th row | Provider_Insurance_Claim_Reimbursement_Amt |
Common Values
| Value | Count | Frequency (%) |
| Provider_Insurance_Claim_Reimbursement_Amt | 4497 | 83.1% |
| HospitalDuration_max | 479 | 8.9% |
| prv_avg_claims_indicator | 339 | 6.3% |
| DeductibleAmtPaid_sum | 45 | 0.8% |
| ChronicCond_Heartfailure_mean | 16 | 0.3% |
| perc_allocated_used | 14 | 0.3% |
| ClaimDuration_max | 7 | 0.1% |
| HospitalDuration_std | 7 | 0.1% |
| ChronicCond_Cancer_mean | 3 | 0.1% |
| HospitalDuration_mean | 1 | < 0.1% |
| Other values (2) | 2 | < 0.1% |
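The 74.2% imbalance flagged for Top1_Feature can be reproduced from the counts above. The sketch assumes the score is one minus the Shannon entropy of the category frequencies normalised by the log of the number of categories. The report does not state this formula, but it matches the reported figure. The two categories hidden under "Other values (2)" are taken to have one row each, which follows from their combined count of 2.

```python
import math

# Category counts for Top1_Feature; the last two entries are the
# "Other values (2)" categories, assumed to have one row each.
counts = [4497, 479, 339, 45, 16, 14, 7, 7, 3, 1, 1, 1]

n_rows = sum(counts)
probabilities = [c / n_rows for c in counts]

# Shannon entropy of the observed distribution (natural log).
entropy = -sum(p * math.log(p) for p in probabilities)

# Assumed definition: 1 - entropy / maximum possible entropy.
imbalance = 1 - entropy / math.log(len(counts))

print(n_rows)               # 5410
print(f"{imbalance:.1%}")   # 74.2%
```

A score of 0 would mean all 12 categories are equally frequent; a score near 1 means a single category dominates, as Provider_Insurance_Claim_Reimbursement_Amt does here.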
Most occurring characters
| Value | Count | Frequency (%) |
| e | 22680 | 10.9% |
| _ | 19615 | 9.4% |
| r | 19233 | 9.2% |
| m | 18934 | 9.1% |
| i | 15627 | 7.5% |
| n | 14390 | 6.9% |
| a | 11616 | 5.6% |
| t | 10442 | 5.0% |
| s | 9890 | 4.7% |
| u | 9609 | 4.6% |
| Other values (20) | 56647 | 27.1% |
Top1_SHAP
Real number (ℝ)
High correlation 
| Distinct | 5401 |
|---|---|
| Distinct (%) | 99.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -1.2353862 |
| Minimum | -2.6842446 |
|---|---|
| Maximum | 2.6063855 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 4470 |
| Negative (%) | 82.6% |
| Memory size | 42.4 KiB |
Quantile statistics
| Minimum | -2.6842446 |
|---|---|
| 5-th percentile | -2.4068657 |
| Q1 | -2.1729992 |
| median | -1.7461665 |
| Q3 | -0.92443412 |
| 95-th percentile | 1.4429291 |
| Maximum | 2.6063855 |
| Range | 5.2906301 |
| Interquartile range (IQR) | 1.2485651 |
Descriptive statistics
| Standard deviation | 1.2945363 |
|---|---|
| Coefficient of variation (CV) | -1.0478798 |
| Kurtosis | 0.66806759 |
| Mean | -1.2353862 |
| Median Absolute Deviation (MAD) | 0.53342845 |
| Skewness | 1.3444188 |
| Sum | -6683.4394 |
| Variance | 1.6758243 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| -2.0813987 | 3 | 0.1% |
| -2.1875658 | 2 | < 0.1% |
| -2.183212 | 2 | < 0.1% |
| -2.2915285 | 2 | < 0.1% |
| -2.1374512 | 2 | < 0.1% |
| -2.3748865 | 2 | < 0.1% |
| -2.0690496 | 2 | < 0.1% |
| -2.1070178 | 2 | < 0.1% |
| -1.8546634 | 1 | < 0.1% |
| -2.093958 | 1 | < 0.1% |
| Other values (5391) | 5391 | 99.6% |
Minimum 10 values
| Value | Count | Frequency (%) |
| -2.6842446 | 1 | < 0.1% |
| -2.681831 | 1 | < 0.1% |
| -2.6625078 | 1 | < 0.1% |
| -2.6603427 | 1 | < 0.1% |
| -2.6593232 | 1 | < 0.1% |
| -2.6525118 | 1 | < 0.1% |
| -2.65052 | 1 | < 0.1% |
| -2.6447027 | 1 | < 0.1% |
| -2.6397154 | 1 | < 0.1% |
| -2.629413 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 2.6063855 | 1 | < 0.1% |
| 2.547455 | 1 | < 0.1% |
| 2.511147 | 1 | < 0.1% |
| 2.4954474 | 1 | < 0.1% |
| 2.4939666 | 1 | < 0.1% |
| 2.4802988 | 1 | < 0.1% |
| 2.4797082 | 1 | < 0.1% |
| 2.471098 | 1 | < 0.1% |
| 2.4659982 | 1 | < 0.1% |
| 2.4599466 | 1 | < 0.1% |
Top2_Feature
Categorical
| Distinct | 29 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 425.5 KiB |
| DeductibleAmtPaid_sum | 1720 |
|---|---|
| prv_avg_claims_indicator | 1569 |
| HospitalDuration_max | 1089 |
| ChronicCond_Heartfailure_mean | 286 |
| Provider_Insurance_Claim_Reimbursement_Amt | 268 |
| Other values (24) | 478 |
Length
| Max length | 42 |
|---|---|
| Median length | 39 |
| Mean length | 23.511091 |
| Min length | 17 |
Unique
| Unique | 6 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | prv_avg_claims_indicator |
|---|---|
| 2nd row | DeductibleAmtPaid_sum |
| 3rd row | HospitalDuration_max |
| 4th row | DeductibleAmtPaid_sum |
| 5th row | HospitalDuration_max |
Common Values
| Value | Count | Frequency (%) |
| DeductibleAmtPaid_sum | 1720 | 31.8% |
| prv_avg_claims_indicator | 1569 | 29.0% |
| HospitalDuration_max | 1089 | 20.1% |
| ChronicCond_Heartfailure_mean | 286 | 5.3% |
| Provider_Insurance_Claim_Reimbursement_Amt | 268 | 5.0% |
| Avg_InscClaimAmtReimbursed_Per_Provider | 96 | 1.8% |
| HospitalDuration_std | 86 | 1.6% |
| ClaimDuration_max | 74 | 1.4% |
| ChronicCond_KidneyDisease_mean | 64 | 1.2% |
| ChronicCond_Cancer_mean | 45 | 0.8% |
| Other values (19) | 113 | 2.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 12689 | 10.0% |
| a | 12030 | 9.5% |
| _ | 9999 | 7.9% |
| t | 8580 | 6.7% |
| m | 8143 | 6.4% |
| r | 7002 | 5.5% |
| e | 6561 | 5.2% |
| d | 6102 | 4.8% |
| c | 5745 | 4.5% |
| u | 5728 | 4.5% |
| Other values (23) | 44616 | 35.1% |
Top2_SHAP
Real number (ℝ)
| Distinct | 5399 |
|---|---|
| Distinct (%) | 99.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -0.48001872 |
| Minimum | -1.3274089 |
|---|---|
| Maximum | 1.228544 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 4593 |
| Negative (%) | 84.9% |
| Memory size | 42.4 KiB |
Quantile statistics
| Minimum | -1.3274089 |
|---|---|
| 5-th percentile | -0.98094258 |
| Q1 | -0.73744841 |
| median | -0.63287482 |
| Q3 | -0.5258627 |
| 95-th percentile | 0.70683187 |
| Maximum | 1.228544 |
| Range | 2.5559529 |
| Interquartile range (IQR) | 0.21158571 |
Descriptive statistics
| Standard deviation | 0.50587635 |
|---|---|
| Coefficient of variation (CV) | -1.053868 |
| Kurtosis | 1.4823705 |
| Mean | -0.48001872 |
| Median Absolute Deviation (MAD) | 0.10559684 |
| Skewness | 1.6475048 |
| Sum | -2596.9013 |
| Variance | 0.25591088 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| -0.6204952 | 3 | 0.1% |
| -0.55736446 | 2 | < 0.1% |
| -0.61724293 | 2 | < 0.1% |
| -0.52102983 | 2 | < 0.1% |
| -0.79280275 | 2 | < 0.1% |
| -0.63384634 | 2 | < 0.1% |
| -0.54998827 | 2 | < 0.1% |
| -0.7687515 | 2 | < 0.1% |
| -0.51584965 | 2 | < 0.1% |
| -0.71805626 | 2 | < 0.1% |
| Other values (5389) | 5389 | 99.6% |
Minimum 10 values
| Value | Count | Frequency (%) |
| -1.3274089 | 1 | < 0.1% |
| -1.3216667 | 1 | < 0.1% |
| -1.2959567 | 1 | < 0.1% |
| -1.271386 | 1 | < 0.1% |
| -1.2682517 | 1 | < 0.1% |
| -1.2627395 | 1 | < 0.1% |
| -1.2499744 | 1 | < 0.1% |
| -1.2422806 | 1 | < 0.1% |
| -1.2310311 | 1 | < 0.1% |
| -1.2197806 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 1.228544 | 1 | < 0.1% |
| 1.157284 | 1 | < 0.1% |
| 1.1402673 | 1 | < 0.1% |
| 1.1113061 | 1 | < 0.1% |
| 1.088396 | 1 | < 0.1% |
| 1.0832257 | 1 | < 0.1% |
| 1.0674474 | 1 | < 0.1% |
| 1.0607847 | 1 | < 0.1% |
| 1.0555196 | 1 | < 0.1% |
| 1.0528997 | 1 | < 0.1% |
Top3_Feature
Categorical
High correlation 
| Distinct | 36 |
|---|---|
| Distinct (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 434.3 KiB |
| DeductibleAmtPaid_sum | 862 |
|---|---|
| prv_avg_claims_indicator | 834 |
| HospitalDuration_max | 729 |
| Avg_InscClaimAmtReimbursed_Per_Provider | 545 |
| ChronicCond_Heartfailure_mean | 514 |
| Other values (31) | 1926 |
Length
| Max length | 42 |
|---|---|
| Median length | 36 |
| Mean length | 25.175231 |
| Min length | 17 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | DeductibleAmtPaid_sum |
|---|---|
| 2nd row | HospitalDuration_max |
| 3rd row | OperatingPhysician_nunique |
| 4th row | prv_avg_claims_indicator |
| 5th row | HospitalDuration_sum |
Common Values
| Value | Count | Frequency (%) |
| DeductibleAmtPaid_sum | 862 | 15.9% |
| prv_avg_claims_indicator | 834 | 15.4% |
| HospitalDuration_max | 729 | 13.5% |
| Avg_InscClaimAmtReimbursed_Per_Provider | 545 | 10.1% |
| ChronicCond_Heartfailure_mean | 514 | 9.5% |
| ChronicCond_KidneyDisease_mean | 342 | 6.3% |
| HospitalDuration_std | 306 | 5.7% |
| ClaimDuration_max | 284 | 5.2% |
| Provider_Insurance_Claim_Reimbursement_Amt | 211 | 3.9% |
| ChronicCond_Cancer_mean | 124 | 2.3% |
| Other values (26) | 659 | 12.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 12326 | 9.1% |
| a | 11800 | 8.7% |
| _ | 10079 | 7.4% |
| e | 9306 | 6.8% |
| r | 8877 | 6.5% |
| m | 8219 | 6.0% |
| n | 7871 | 5.8% |
| t | 7605 | 5.6% |
| o | 6886 | 5.1% |
| d | 6073 | 4.5% |
| Other values (24) | 47156 | 34.6% |
Top3_SHAP
Real number (ℝ)
| Distinct | 5398 |
|---|---|
| Distinct (%) | 99.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -0.37359158 |
| Minimum | -1.1483903 |
|---|---|
| Maximum | 0.9962663 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 4670 |
| Negative (%) | 86.3% |
| Memory size | 42.4 KiB |
Quantile statistics
| Minimum | -1.1483903 |
|---|---|
| 5-th percentile | -0.75700851 |
| Q1 | -0.5617617 |
| median | -0.47057586 |
| Q3 | -0.35199756 |
| 95-th percentile | 0.50379438 |
| Maximum | 0.9962663 |
| Range | 2.1446566 |
| Interquartile range (IQR) | 0.20976415 |
Descriptive statistics
| Standard deviation | 0.36255592 |
|---|---|
| Coefficient of variation (CV) | -0.97046063 |
| Kurtosis | 2.0318449 |
| Mean | -0.37359158 |
| Median Absolute Deviation (MAD) | 0.10233378 |
| Skewness | 1.6311461 |
| Sum | -2021.1304 |
| Variance | 0.13144679 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| -0.5617617 | 3 | 0.1% |
| -0.50767875 | 3 | 0.1% |
| -0.5406518 | 2 | < 0.1% |
| -0.53682566 | 2 | < 0.1% |
| 0.50562495 | 2 | < 0.1% |
| -0.5329638 | 2 | < 0.1% |
| -0.47945094 | 2 | < 0.1% |
| -0.40415084 | 2 | < 0.1% |
| -0.55073494 | 2 | < 0.1% |
| -0.5959276 | 2 | < 0.1% |
| Other values (5388) | 5388 | 99.6% |
Minimum 10 values
| Value | Count | Frequency (%) |
| -1.1483903 | 1 | < 0.1% |
| -1.1401569 | 1 | < 0.1% |
| -1.1099311 | 1 | < 0.1% |
| -1.0998387 | 1 | < 0.1% |
| -1.0993729 | 1 | < 0.1% |
| -1.0891747 | 1 | < 0.1% |
| -1.0634226 | 1 | < 0.1% |
| -1.0391371 | 1 | < 0.1% |
| -1.0390397 | 1 | < 0.1% |
| -1.0369247 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 0.9962663 | 1 | < 0.1% |
| 0.9546397 | 1 | < 0.1% |
| 0.9452925 | 1 | < 0.1% |
| 0.906611 | 1 | < 0.1% |
| 0.9030236 | 1 | < 0.1% |
| 0.8917418 | 1 | < 0.1% |
| 0.8745094 | 1 | < 0.1% |
| 0.86896515 | 1 | < 0.1% |
| 0.85832447 | 1 | < 0.1% |
| 0.85512525 | 1 | < 0.1% |
Fraud_Probability_Percent
Real number (ℝ)
High correlation 
| Distinct | 5401 |
|---|---|
| Distinct (%) | 99.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15.013172 |
| Minimum | 0.022823537 |
|---|---|
| Maximum | 99.438225 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 42.4 KiB |
Quantile statistics
| Minimum | 0.022823537 |
|---|---|
| 5-th percentile | 0.098103074 |
| Q1 | 0.31666447 |
| median | 1.0029438 |
| Q3 | 7.0688569 |
| 95-th percentile | 93.668436 |
| Maximum | 99.438225 |
| Range | 99.415401 |
| Interquartile range (IQR) | 6.7521924 |
Descriptive statistics
| Standard deviation | 29.431393 |
|---|---|
| Coefficient of variation (CV) | 1.9603714 |
| Kurtosis | 2.3045603 |
| Mean | 15.013172 |
| Median Absolute Deviation (MAD) | 0.86339175 |
| Skewness | 1.9765938 |
| Sum | 81221.26 |
| Variance | 866.20688 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 0.1222939 | 3 | 0.1% |
| 0.09220703 | 3 | 0.1% |
| 0.15959388 | 2 | < 0.1% |
| 0.0902706 | 2 | < 0.1% |
| 0.04993615 | 2 | < 0.1% |
| 0.22830269 | 2 | < 0.1% |
| 0.1508889 | 2 | < 0.1% |
| 4.3603277 | 1 | < 0.1% |
| 0.13410501 | 1 | < 0.1% |
| 0.12702413 | 1 | < 0.1% |
| Other values (5391) | 5391 | 99.6% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0.022823537 | 1 | < 0.1% |
| 0.02561417 | 1 | < 0.1% |
| 0.026063185 | 1 | < 0.1% |
| 0.026425801 | 1 | < 0.1% |
| 0.027018946 | 1 | < 0.1% |
| 0.027636547 | 1 | < 0.1% |
| 0.028678546 | 1 | < 0.1% |
| 0.029498355 | 1 | < 0.1% |
| 0.03146881 | 1 | < 0.1% |
| 0.032562297 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 99.438225 | 1 | < 0.1% |
| 99.43239 | 1 | < 0.1% |
| 99.41806 | 1 | < 0.1% |
| 99.40916 | 1 | < 0.1% |
| 99.40706 | 1 | < 0.1% |
| 99.39267 | 1 | < 0.1% |
| 99.331375 | 1 | < 0.1% |
| 99.322235 | 1 | < 0.1% |
| 99.28554 | 1 | < 0.1% |
| 99.23529 | 1 | < 0.1% |
Top1_ActualValue
Real number (ℝ)
| Distinct | 3077 |
|---|---|
| Distinct (%) | 57.1% |
| Missing | 18 |
| Missing (%) | 0.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 84306.799 |
| Minimum | 0 |
|---|---|
| Maximum | 5996050 |
| Zeros | 16 |
| Zeros (%) | 0.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 42.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1.0740455 |
| Q1 | 607.5 |
| median | 7305 |
| Q3 | 31645 |
| 95-th percentile | 465012 |
| Maximum | 5996050 |
| Range | 5996050 |
| Interquartile range (IQR) | 31037.5 |
Descriptive statistics
| Standard deviation | 271187.86 |
|---|---|
| Coefficient of variation (CV) | 3.2166784 |
| Kurtosis | 88.054632 |
| Mean | 84306.799 |
| Median Absolute Deviation (MAD) | 7293 |
| Skewness | 7.3618808 |
| Sum | 4.5458226 × 10⁸ |
| Variance | 7.3542854 × 10¹⁰ |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 1 | 176 | 3.3% |
| 10 | 43 | 0.8% |
| 13 | 38 | 0.7% |
| 35 | 36 | 0.7% |
| 9 | 30 | 0.6% |
| 6 | 28 | 0.5% |
| 11 | 27 | 0.5% |
| 12 | 26 | 0.5% |
| 100 | 26 | 0.5% |
| 15 | 23 | 0.4% |
| Other values (3067) | 4939 | 91.3% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 16 | 0.3% |
| 0.2767857143 | 1 | < 0.1% |
| 0.2857142857 | 1 | < 0.1% |
| 0.3225806452 | 1 | < 0.1% |
| 0.3666666667 | 1 | < 0.1% |
| 0.435483871 | 1 | < 0.1% |
| 0.4489795918 | 1 | < 0.1% |
| 0.45 | 1 | < 0.1% |
| 0.4556962025 | 1 | < 0.1% |
| 0.4615384615 | 2 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 5996050 | 1 | < 0.1% |
| 4713830 | 1 | < 0.1% |
| 3212000 | 1 | < 0.1% |
| 3133880 | 1 | < 0.1% |
| 2969530 | 1 | < 0.1% |
| 2914700 | 1 | < 0.1% |
| 2831940 | 1 | < 0.1% |
| 2744870 | 1 | < 0.1% |
| 2612740 | 1 | < 0.1% |
| 2540130 | 1 | < 0.1% |
Top2_ActualValue
Real number (ℝ)
Zeros 
| Distinct | 1125 |
|---|---|
| Distinct (%) | 20.9% |
| Missing | 35 |
| Missing (%) | 0.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5760.4553 |
| Minimum | 0 |
|---|---|
| Maximum | 292220 |
| Zeros | 515 |
| Zeros (%) | 9.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 42.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 5 |
| Q3 | 90 |
| 95-th percentile | 32452 |
| Maximum | 292220 |
| Range | 292220 |
| Interquartile range (IQR) | 89 |
Descriptive statistics
| Standard deviation | 27690.13 |
|---|---|
| Coefficient of variation (CV) | 4.8069342 |
| Kurtosis | 35.52537 |
| Mean | 5760.4553 |
| Median Absolute Deviation (MAD) | 5 |
| Skewness | 5.7580468 |
| Sum | 30962447 |
| Variance | 7.6674327 × 10⁸ |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 1 | 1257 | 23.2% |
| 0 | 515 | 9.5% |
| 35 | 188 | 3.5% |
| 100 | 86 | 1.6% |
| 10 | 68 | 1.3% |
| 20 | 66 | 1.2% |
| 5 | 57 | 1.1% |
| 80 | 55 | 1.0% |
| 4 | 52 | 1.0% |
| 40 | 50 | 0.9% |
| Other values (1115) | 2981 | 55.1% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 515 | 9.5% |
| 0.03125 | 1 | < 0.1% |
| 0.03703703704 | 1 | < 0.1% |
| 0.05263157895 | 1 | < 0.1% |
| 0.07484407484 | 1 | < 0.1% |
| 0.08232445521 | 1 | < 0.1% |
| 0.08333333333 | 1 | < 0.1% |
| 0.08474576271 | 1 | < 0.1% |
| 0.1 | 1 | < 0.1% |
| 0.1052631579 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 292220 | 1 | < 0.1% |
| 268340 | 1 | < 0.1% |
| 259850 | 1 | < 0.1% |
| 255250 | 1 | < 0.1% |
| 240840 | 1 | < 0.1% |
| 237930 | 1 | < 0.1% |
| 237000 | 1 | < 0.1% |
| 236720 | 1 | < 0.1% |
| 226800 | 1 | < 0.1% |
| 224960 | 1 | < 0.1% |
Top3_ActualValue
Real number (ℝ)
High correlation  Missing  Zeros 
| Distinct | 2295 |
|---|---|
| Distinct (%) | 45.1% |
| Missing | 317 |
| Missing (%) | 5.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5489.0327 |
| Minimum | -102073.04 |
|---|---|
| Maximum | 638766.96 |
| Zeros | 693 |
| Zeros (%) | 12.8% |
| Negative | 10 |
| Negative (%) | 0.2% |
| Memory size | 42.4 KiB |
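The percentages in this overview can be checked against the counts. One detail worth noting: the missing, zero and negative shares match a denominator of all 5410 rows, while the distinct share of 45.1% only matches when the 317 missing rows are excluded. That second denominator is an inference from the numbers, not something the report states.

```python
n_rows = 5410
missing, zeros, negative, distinct = 317, 693, 10, 2295


def pct(part, whole):
    """Percentage rounded to one decimal, as displayed in the report."""
    return round(100 * part / whole, 1)


print(pct(missing, n_rows))             # 5.9
print(pct(zeros, n_rows))               # 12.8
print(pct(negative, n_rows))            # 0.2
# Assumption: distinct % is taken over non-missing rows only.
print(pct(distinct, n_rows - missing))  # 45.1
```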
Quantile statistics
| Minimum | -102073.04 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.41666667 |
| median | 1.1296296 |
| Q3 | 70 |
| 95-th percentile | 34068 |
| Maximum | 638766.96 |
| Range | 740840 |
| Interquartile range (IQR) | 69.583333 |
Descriptive statistics
| Standard deviation | 26612.993 |
|---|---|
| Coefficient of variation (CV) | 4.848394 |
| Kurtosis | 92.627861 |
| Mean | 5489.0327 |
| Median Absolute Deviation (MAD) | 1.1296296 |
| Skewness | 7.2963463 |
| Sum | 27955643 |
| Variance | 7.0825141 × 10⁸ |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 0 | 693 | 12.8% |
| 1 | 310 | 5.7% |
| 20 | 120 | 2.2% |
| 0.5 | 95 | 1.8% |
| 0.3333333333 | 51 | 0.9% |
| 10 | 47 | 0.9% |
| 100 | 46 | 0.9% |
| 35 | 46 | 0.9% |
| 9 | 38 | 0.7% |
| 8 | 35 | 0.6% |
| Other values (2285) | 3612 | 66.8% |
| (Missing) | 317 | 5.9% |
Minimum 10 values
| Value | Count | Frequency (%) |
| -102073.0388 | 1 | < 0.1% |
| -102063.0388 | 1 | < 0.1% |
| -100123.0388 | 1 | < 0.1% |
| -99183.03882 | 1 | < 0.1% |
| -96713.03882 | 1 | < 0.1% |
| -94803.03882 | 1 | < 0.1% |
| -89573.03882 | 1 | < 0.1% |
| -84943.03882 | 1 | < 0.1% |
| -80153.03882 | 1 | < 0.1% |
| -77943.03882 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 638766.9612 | 1 | < 0.1% |
| 305026.9612 | 1 | < 0.1% |
| 298506.9612 | 1 | < 0.1% |
| 271910 | 1 | < 0.1% |
| 239050 | 1 | < 0.1% |
| 235000 | 1 | < 0.1% |
| 229780 | 1 | < 0.1% |
| 218640 | 1 | < 0.1% |
| 213380 | 1 | < 0.1% |
| 210190 | 1 | < 0.1% |
Interactions
Correlations
| | Fraud_Probability | Fraud_Probability_Percent | Top1_ActualValue | Top1_Feature | Top1_SHAP | Top2_ActualValue | Top2_Feature | Top2_SHAP | Top3_ActualValue | Top3_Feature | Top3_SHAP |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Fraud_Probability | 1.000 | 1.000 | 0.351 | 0.140 | 0.801 | 0.466 | 0.231 | 0.394 | 0.403 | 0.228 | 0.469 |
| Fraud_Probability_Percent | 1.000 | 1.000 | 0.351 | 0.140 | 0.801 | 0.466 | 0.231 | 0.394 | 0.403 | 0.228 | 0.469 |
| Top1_ActualValue | 0.351 | 0.351 | 1.000 | 0.000 | 0.351 | 0.220 | 0.091 | 0.200 | 0.136 | 0.187 | 0.204 |
| Top1_Feature | 0.140 | 0.140 | 0.000 | 1.000 | 0.300 | 0.201 | 0.309 | 0.140 | 0.202 | 0.193 | 0.085 |
| Top1_SHAP | 0.801 | 0.801 | 0.351 | 0.300 | 1.000 | 0.404 | 0.274 | 0.240 | 0.410 | 0.295 | 0.224 |
| Top2_ActualValue | 0.466 | 0.466 | 0.220 | 0.201 | 0.404 | 1.000 | 0.334 | 0.182 | 0.174 | 0.078 | 0.241 |
| Top2_Feature | 0.231 | 0.231 | 0.091 | 0.309 | 0.274 | 0.334 | 1.000 | 0.345 | 0.067 | 0.192 | 0.204 |
| Top2_SHAP | 0.394 | 0.394 | 0.200 | 0.140 | 0.240 | 0.182 | 0.345 | 1.000 | 0.115 | 0.239 | 0.402 |
| Top3_ActualValue | 0.403 | 0.403 | 0.136 | 0.202 | 0.410 | 0.174 | 0.067 | 0.115 | 1.000 | 0.563 | 0.162 |
| Top3_Feature | 0.228 | 0.228 | 0.187 | 0.193 | 0.295 | 0.078 | 0.192 | 0.239 | 0.563 | 1.000 | 0.357 |
| Top3_SHAP | 0.469 | 0.469 | 0.204 | 0.085 | 0.224 | 0.241 | 0.204 | 0.402 | 0.162 | 0.357 | 1.000 |
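The high-correlation alerts at the top of the report can be recovered from this matrix by listing every pair above a cut-off. The sketch below uses 0.5 as the cut-off, which is an assumption: it is not stated in the report, but it selects exactly the pairs that were flagged.

```python
names = [
    "Fraud_Probability", "Fraud_Probability_Percent", "Top1_ActualValue",
    "Top1_Feature", "Top1_SHAP", "Top2_ActualValue", "Top2_Feature",
    "Top2_SHAP", "Top3_ActualValue", "Top3_Feature", "Top3_SHAP",
]

# Rows copied from the correlation table above, in the same order as `names`.
matrix = [
    [1.000, 1.000, 0.351, 0.140, 0.801, 0.466, 0.231, 0.394, 0.403, 0.228, 0.469],
    [1.000, 1.000, 0.351, 0.140, 0.801, 0.466, 0.231, 0.394, 0.403, 0.228, 0.469],
    [0.351, 0.351, 1.000, 0.000, 0.351, 0.220, 0.091, 0.200, 0.136, 0.187, 0.204],
    [0.140, 0.140, 0.000, 1.000, 0.300, 0.201, 0.309, 0.140, 0.202, 0.193, 0.085],
    [0.801, 0.801, 0.351, 0.300, 1.000, 0.404, 0.274, 0.240, 0.410, 0.295, 0.224],
    [0.466, 0.466, 0.220, 0.201, 0.404, 1.000, 0.334, 0.182, 0.174, 0.078, 0.241],
    [0.231, 0.231, 0.091, 0.309, 0.274, 0.334, 1.000, 0.345, 0.067, 0.192, 0.204],
    [0.394, 0.394, 0.200, 0.140, 0.240, 0.182, 0.345, 1.000, 0.115, 0.239, 0.402],
    [0.403, 0.403, 0.136, 0.202, 0.410, 0.174, 0.067, 0.115, 1.000, 0.563, 0.162],
    [0.228, 0.228, 0.187, 0.193, 0.295, 0.078, 0.192, 0.239, 0.563, 1.000, 0.357],
    [0.469, 0.469, 0.204, 0.085, 0.224, 0.241, 0.204, 0.402, 0.162, 0.357, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, not taken from the report

# Walk the upper triangle so each pair is reported once.
high = [
    (names[i], names[j], matrix[i][j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if matrix[i][j] >= THRESHOLD
]

for first, second, value in high:
    print(f"{first} ~ {second}: {value:.3f}")
```

Four pairs pass the cut-off: the two fraud probability columns with each other, each of them with Top1_SHAP, and Top3_ActualValue with Top3_Feature.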
Missing values
Sample
First rows
| | Provider | Fraud_Probability | Top1_Feature | Top1_SHAP | Top2_Feature | Top2_SHAP | Top3_Feature | Top3_SHAP | Fraud_Probability_Percent | Top1_ActualValue | Top2_ActualValue | Top3_ActualValue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | PRV52145 | 0.018644 | Provider_Insurance_Claim_Reimbursement_Amt | -1.030070 | prv_avg_claims_indicator | -0.586857 | DeductibleAmtPaid_sum | -0.498692 | 1.864357 | 60910.0 | 1.038136 | 650.000000 |
| 1 | PRV55104 | 0.016717 | Provider_Insurance_Claim_Reimbursement_Amt | -1.947104 | DeductibleAmtPaid_sum | -0.676061 | HospitalDuration_max | -0.263112 | 1.671673 | 14880.0 | 380.000000 | NaN |
| 2 | PRV54894 | 0.982605 | Provider_Insurance_Claim_Reimbursement_Amt | 2.280505 | HospitalDuration_max | 0.655633 | OperatingPhysician_nunique | 0.234211 | 98.260470 | 1757060.0 | 35.000000 | 31.000000 |
| 3 | PRV54927 | 0.003985 | Provider_Insurance_Claim_Reimbursement_Amt | -1.784703 | DeductibleAmtPaid_sum | -0.727324 | prv_avg_claims_indicator | -0.436268 | 0.398518 | 22600.0 | 50.000000 | 1.085106 |
| 4 | PRV55215 | 0.983695 | Provider_Insurance_Claim_Reimbursement_Amt | 2.404095 | HospitalDuration_max | 0.535668 | HospitalDuration_sum | -0.256319 | 98.369470 | 2284560.0 | 34.000000 | 792.000000 |
| 5 | PRV54950 | 0.794330 | HospitalDuration_max | 1.093456 | prv_avg_claims_indicator | -0.617702 | HospitalDuration_std | 0.605565 | 79.432980 | 27.0 | 1.000000 | 10.059821 |
| 6 | PRV57317 | 0.981709 | Provider_Insurance_Claim_Reimbursement_Amt | 2.511147 | HospitalDuration_max | 0.570222 | ClmAdmitDiagnosisCode_Count | 0.275034 | 98.170900 | 926330.0 | 22.000000 | 164.000000 |
| 7 | PRV57333 | 0.146077 | Provider_Insurance_Claim_Reimbursement_Amt | 0.836762 | HospitalDuration_max | -0.409759 | ChronicCond_Heartfailure_mean | -0.386471 | 14.607726 | 291020.0 | NaN | 0.527888 |
| 8 | PRV53100 | 0.061861 | Provider_Insurance_Claim_Reimbursement_Amt | -0.825458 | HospitalDuration_max | -0.697472 | HospitalDuration_std | -0.451650 | 6.186097 | 86880.0 | 9.000000 | 3.847077 |
| 9 | PRV56021 | 0.014879 | Provider_Insurance_Claim_Reimbursement_Amt | -1.280189 | DeductibleAmtPaid_sum | -0.691491 | prv_avg_claims_indicator | -0.470946 | 1.487865 | 39920.0 | 150.000000 | 1.032258 |
Last rows
| | Provider | Fraud_Probability | Top1_Feature | Top1_SHAP | Top2_Feature | Top2_SHAP | Top3_Feature | Top3_SHAP | Fraud_Probability_Percent | Top1_ActualValue | Top2_ActualValue | Top3_ActualValue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5400 | PRV52804 | 0.001116 | Provider_Insurance_Claim_Reimbursement_Amt | -1.135756 | prv_avg_claims_indicator | -0.707482 | HospitalDuration_max | -0.531498 | 0.111566 | 36000.0 | 1.0 | 13.0 |
| 5401 | PRV55637 | 0.011534 | Provider_Insurance_Claim_Reimbursement_Amt | -1.995208 | prv_avg_claims_indicator | -0.746210 | ChronicCond_KidneyDisease_mean | 0.619416 | 1.153426 | 300.0 | 1.0 | 1.0 |
| 5402 | PRV57758 | 0.002682 | Provider_Insurance_Claim_Reimbursement_Amt | -2.015305 | prv_avg_claims_indicator | -0.684711 | DeductibleAmtPaid_sum | -0.465172 | 0.268191 | 110.0 | 1.0 | 0.0 |
| 5403 | PRV57655 | 0.000462 | Provider_Insurance_Claim_Reimbursement_Amt | -2.217601 | prv_avg_claims_indicator | -0.706257 | ChronicCond_KidneyDisease_mean | -0.595724 | 0.046188 | 400.0 | 1.0 | 0.0 |
| 5404 | PRV54295 | 0.001375 | Provider_Insurance_Claim_Reimbursement_Amt | -2.116444 | prv_avg_claims_indicator | -0.791678 | ChronicCond_KidneyDisease_mean | -0.549422 | 0.137460 | 3300.0 | 1.0 | 0.0 |
| 5405 | PRV56056 | 0.000903 | Provider_Insurance_Claim_Reimbursement_Amt | -2.128856 | prv_avg_claims_indicator | -0.718056 | ChronicCond_Heartfailure_mean | -0.595928 | 0.090271 | 40.0 | 1.0 | 0.0 |
| 5406 | PRV54820 | 0.008429 | Provider_Insurance_Claim_Reimbursement_Amt | -1.418610 | prv_avg_claims_indicator | -0.967576 | HospitalDuration_max | -0.602500 | 0.842928 | 12000.0 | 1.0 | 2.0 |
| 5407 | PRV56029 | 0.000554 | Provider_Insurance_Claim_Reimbursement_Amt | -2.156970 | prv_avg_claims_indicator | -0.591558 | ChronicCond_KidneyDisease_mean | -0.561762 | 0.055379 | 100.0 | 1.0 | 0.0 |
| 5408 | PRV51751 | 0.001722 | Provider_Insurance_Claim_Reimbursement_Amt | -2.254718 | prv_avg_claims_indicator | -0.734502 | DeductibleAmtPaid_sum | -0.449137 | 0.172225 | 900.0 | 1.0 | 0.0 |
| 5409 | PRV55405 | 0.011111 | Provider_Insurance_Claim_Reimbursement_Amt | -2.017444 | ChronicCond_Heartfailure_mean | -0.737984 | DeductibleAmtPaid_sum | -0.537528 | 1.111141 | 2730.0 | 0.0 | 200.0 |
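Each row pairs a provider's fraud probability with its three strongest SHAP contributions and the underlying feature values. The sketch below turns one sample row (index 2, PRV54894) into a short readable summary. The `describe` helper and the dictionary layout are hypothetical, and it assumes the usual SHAP convention that a positive value pushes the prediction towards fraud.

```python
# Sample row 2 from the table above, regrouped per driver.
row = {
    "Provider": "PRV54894",
    "Fraud_Probability_Percent": 98.260470,
    "drivers": [
        # (feature name, SHAP value, actual feature value)
        ("Provider_Insurance_Claim_Reimbursement_Amt", 2.280505, 1757060.0),
        ("HospitalDuration_max", 0.655633, 35.0),
        ("OperatingPhysician_nunique", 0.234211, 31.0),
    ],
}


def describe(row):
    """Build a plain-text explanation for one provider (hypothetical helper)."""
    lines = [
        f"{row['Provider']}: {row['Fraud_Probability_Percent']:.1f}% fraud probability"
    ]
    for name, shap_value, actual in row["drivers"]:
        # Assumption: positive SHAP raises the fraud score, negative lowers it.
        direction = "raises" if shap_value > 0 else "lowers"
        lines.append(
            f"  {name} = {actual:,.2f} {direction} the score (SHAP {shap_value:+.3f})"
        )
    return "\n".join(lines)


print(describe(row))
```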